Amendments to the Specification



Paragraph from page 4, lines 21-23 



Viseme - the minimum distinctive visual manifestation of an
acoustic identification of an articulatory type) [[.]]
Representation in a video or motion picture,
Paragraph from page 5, line 1 - page 6, line 5







The present invention takes advantage of the
advancements achieved in the field of visual information, or
visemes, in speech recognition, which are the subject of
co-pending U.S. Patent application Serial No. 09/452,919 filed
December 2, 1999 (Y0999-428) entitled "Late Integration in
Audio-Visual Continuous Speech Recognition" by Verma, et al;
patent application Serial No: 09/369,707 (Y0999-317) entitled
"Methods and Apparatus for Audio-Visual Speech Detection and
Recognition" by S. Basu, et al; and Serial No: 09/369,706
(Y0999-318) entitled "Methods and Apparatus for Audio-Visual
Speaker Recognition and Utterance Verification" by S. Basu, et al,
now U.S. Patent 6,219,640 which issued on 17 April 2001. As
detailed therein, visual information, such as the mouth
parameters of height, width, and area, along with derivative
image information are used to continuously recognize speech,
particularly in a non-controlled environment which may have
multiple extraneous noise sources. Further to the enhancement of
speech recognition using facial analysis (see: the 6,219,640
patent [[09/369,707 application]]) and the speaker recognition using
audio and visual recognition techniques (the [[09/369,706]] 6,219,640
patent [[application]]), the Verma patent application focusses on the
fusion (or alignment) of data output from a visual recognizer and
audio recognizer to improve speech recognition accuracy and to
provide automatic speech detection. More particularly, the Verma
patent application processes a video signal to identify a class
of the most likely visemes found in the signal. Thereafter, the
most likely phones and/or phonemes associated with the identified
visemes, or with the audio signal, are considered for audio
recognition purposes. Therefore, the system and method of the
Verma patent applications use both audio and video processing to
discern phones produced by the subject, and the phones are, in
turn, linked together to discern words.
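
The narrowing of phoneme candidates by identified visemes, as
summarized above, can be illustrated with a minimal Python sketch.
It is not the method of the Verma application: the viseme classes,
the table VISEME_TO_PHONEMES, the function pick_phoneme, and the
score format are all hypothetical names and values chosen only for
illustration.

```python
# Hypothetical viseme-to-phoneme table; the actual classes used by
# the referenced application are not given in this text.
VISEME_TO_PHONEMES = {
    "bilabial": {"p", "b", "m"},
    "labiodental": {"f", "v"},
    "rounded": {"o", "u", "w"},
}


def pick_phoneme(likely_visemes, audio_scores):
    """Return the best-scoring phoneme consistent with the visemes.

    likely_visemes: viseme class names from a video recognizer.
    audio_scores: dict mapping phoneme -> score from an audio recognizer.
    Falls back to the audio scores alone when no listed viseme
    yields a phoneme that the audio recognizer has scored.
    """
    candidates = set()
    for viseme in likely_visemes:
        candidates |= VISEME_TO_PHONEMES.get(viseme, set())

    # Keep only the audio hypotheses that the video evidence allows.
    scored = {p: s for p, s in audio_scores.items() if p in candidates}
    if not scored:
        scored = audio_scores
    return max(scored, key=scored.get)
```

In this sketch the video evidence only restricts the set of audio
hypotheses; a weighted combination of audio and video scores would
be an equally plausible reading of "fusion".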
Paragraph from page 14, line 4 - page 15, line 2

It is noteworthy that the synchronization algorithm can be
applied, as desired, to prerecorded audiovisual materials; or it
can be applied on-the-fly and continuously to, for example,
"live" audiovisual materials. Also, although this invention has
been described using English-language examples, there is nothing
that restricts it to English and it can be implemented for any
language. Finally, it should be understood that, while the
highest visual recognition accuracy has been realized using
facial features linked to speech, it is possible to recognize
non-speech acoustic signatures, to link those non-speech acoustic
signatures to non-speech visual "cues" (for example,
hand-clapping), to time-stamp the audio and visual output
streams, and to synchronize the audio and video based on the
identified cues in the time stamped output streams. Under such a
visual recognition scenario, the process flow of Fig. 2 would be
generalized to the steps of image extraction, feature detection,
feature parameter analysis, and correlation of acoustic
signatures stored in a database to the feature parameters. For a
detailed discussion of the training and use of speech recognition
means for identifying audio sources by acoustic signatures,
please see co-pending patent application Serial No: 09/602,452,
(YOR9-2000-0130) entitled "System and Method for Control of
Lights, Signals, Alarms Using Sound Detection" by W. Ablondi, et
al, the teachings of which are herein incorporated by reference.
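
The cue-based synchronization described above (time-stamping both
output streams and aligning them on identified cues) can be
sketched in Python as follows. This is only an illustration under
stated assumptions: the cue format (label, time in seconds), the
functions estimate_offset and resync, the max_gap pairing window,
and the use of a median are hypothetical choices, not details taken
from the specification.

```python
from statistics import median


def estimate_offset(audio_cues, video_cues, max_gap=1.0):
    """Estimate how far the audio lags the video, in seconds.

    audio_cues, video_cues: lists of (label, timestamp_seconds).
    Each video cue is paired with the nearest audio cue carrying the
    same label within max_gap seconds. Returns the median of
    (audio_time - video_time) over all pairs, or None if no cue
    could be paired.
    """
    diffs = []
    for label, v_time in video_cues:
        same = [
            a_time
            for a_label, a_time in audio_cues
            if a_label == label and abs(a_time - v_time) <= max_gap
        ]
        if same:
            nearest = min(same, key=lambda a_time: abs(a_time - v_time))
            diffs.append(nearest - v_time)
    return median(diffs) if diffs else None


def resync(audio_cues, offset):
    """Shift audio time stamps by the estimated offset."""
    return [(label, t - offset) for label, t in audio_cues]
```

For example, with hand-clap cues seen at 1.0, 5.0, and 9.0 seconds
in the video and heard at 1.2, 5.2, and 9.3 seconds in the audio,
estimate_offset reports a lag of about 0.2 seconds. The median is
used here so that a single mispaired cue does not skew the estimate.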